{ "cells": [ { "cell_type": "markdown", "id": "fc399b10-900a-427f-9291-a0a547699198", "metadata": { "editable": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "# Other options of deployment using `Streamlit` and `Heroku` -- Platform as a Service\n", "\n", "## ML Deployment using Streamlit" ] }, { "cell_type": "markdown", "id": "bef365fe-bf18-4d7d-a5c4-2eb2ca78e4b5", "metadata": { "editable": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "### Deployment using Streamlit\n" ] }, { "cell_type": "markdown", "id": "12184d9c-82be-4f00-8757-e6770a1311b2", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "Streamlit\n", "- Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom web apps for machine learning and data science.\n", "- In just a few minutes you can build and deploy powerful data apps. So [let's get started!](https://docs.streamlit.io/library/get-started/)" ] }, { "cell_type": "markdown", "id": "77723450-a86f-4f34-b61c-f1daba864c3a", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "Deployment of IRIS plan classification:\n", "\n", "- Using [IRIS dataset](https://www.kaggle.com/datasets/uciml/iris)\n", "- The Iris dataset was used in R.A. Fisher's classic 1936 paper.\n", "- The Use of Multiple Measurements in Taxonomic Problems, and can also be found on the UCI Machine Learning Repository.\n", "\n", "![IRIS Image](Images/Docker/IRIS_dataset.jpg)" ] }, { "cell_type": "markdown", "id": "32705a7c-a550-4d02-9468-d99d44fcd092", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "- It includes three iris species with 50 samples each as well as some properties about each flower.\n", "- One flower species is linearly separable from the other two, but the other two are not linearly separable from each other. \n", "\n", "The columns in this dataset are:\n", "\n", " Id\n", " SepalLengthCm\n", " SepalWidthCm\n", " PetalLengthCm\n", " PetalWidthCm\n", " Species" ] }, { "cell_type": "markdown", "id": "f10b35c6-3e3d-42f3-9056-3b106a414af5", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "#### Training the model" ] }, { "cell_type": "markdown", "id": "9f441830-339a-4db8-9a4f-6a6f835f5cb3", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "##### Import the data" ] }, { "cell_type": "code", "execution_count": 1, "id": "b6555231-a316-4d29-8ca7-342dfc598b91", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IdSepalLengthCmSepalWidthCmPetalLengthCmPetalWidthCmSpecies
015.13.51.40.2Iris-setosa
124.93.01.40.2Iris-setosa
234.73.21.30.2Iris-setosa
344.63.11.50.2Iris-setosa
455.03.61.40.2Iris-setosa
\n", "
" ], "text/plain": [ " Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species\n", "0 1 5.1 3.5 1.4 0.2 Iris-setosa\n", "1 2 4.9 3.0 1.4 0.2 Iris-setosa\n", "2 3 4.7 3.2 1.3 0.2 Iris-setosa\n", "3 4 4.6 3.1 1.5 0.2 Iris-setosa\n", "4 5 5.0 3.6 1.4 0.2 Iris-setosa" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "\n", "df = pd.read_csv('/home/pruthvish/Ganpat University/Odd_2023_2024/7_MLOPs/TheoryMaterials/Demonstrations/Docker/IRIS_Data/Iris.csv')\n", "df.head()\n" ] }, { "cell_type": "markdown", "id": "908bc1af-0053-48a5-8b56-8168be9e56a9", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "##### Preprocessing" ] }, { "cell_type": "code", "execution_count": 2, "id": "1f88d5fc-5ec1-4e81-bbeb-fe40362404d8", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "# Dropping the Id column\n", "df.drop('Id', axis = 1, inplace = True)\n", "\n", "# Renaming the target column into numbers to aid training of the model\n", "df['Species']= df['Species'].map({'Iris-setosa':0, 'Iris-versicolor':1, 'Iris-virginica':2})\n", "\n", "# splitting the data into the columns which need to be trained(X) and the target column(y)\n", "X = df.iloc[:, :-1]\n", "y = df.iloc[:, -1]" ] }, { "cell_type": "markdown", "id": "ce8df528-6e01-4ffa-8229-4ac658f464e5", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "##### Test-train split" ] }, { "cell_type": "code", "execution_count": 3, "id": "e72cde6c-15b2-4fcf-bd85-0c89cd386881", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "# splitting data into training and testing data with 30 % of data as testing data respectively\n", "from sklearn.model_selection import train_test_split\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)\n", "\n" ] }, { "cell_type": "markdown", "id": "6a0d579d-5971-4068-bddb-9157f22ae355", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "##### Importing the random forest classifier model and training it on the dataset" ] }, { "cell_type": "code", "execution_count": 4, "id": "9f25cbbc-d651-4d66-be0f-e7b79cb04fe9", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.9777777777777777\n" ] } ], "source": [ "# importing the random forest classifier model and training it on the dataset\n", "from sklearn.ensemble import RandomForestClassifier\n", "classifier = RandomForestClassifier()\n", "classifier.fit(X_train, y_train)\n", "\n", "# predicting on the test dataset\n", "y_pred = classifier.predict(X_test)\n", "\n", "# finding out the accuracy\n", "from sklearn.metrics import accuracy_score\n", "score = accuracy_score(y_test, y_pred)\n", "print(score)" ] }, { "cell_type": "markdown", "id": "ee71d454-4b82-4923-8022-bea3a94dd929", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "##### Save the model" ] }, { "cell_type": "code", "execution_count": 5, "id": "e037790b-d71e-4455-b1da-a83969bdf272", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "import pickle\n", "pickle_out = open(\"classifier.pkl\", \"wb\")\n", "pickle.dump(classifier, pickle_out)\n", "pickle_out.close()" ] }, { "cell_type": "markdown", "id": "2ebcd066-da30-47ed-93c0-80ed487b9a1e", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "#### Deploy the model using Streamlit" ] }, { "cell_type": "markdown", "id": "0e85aa7d-b60a-4cf6-8e32-a42fef793ed4", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "##### Task 1: Create the main function specifying the HTML view for user\n" ] }, { "cell_type": "markdown", "id": "f0c3b6af-7dd9-4e3c-af50-157f10fc737a", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "```python\n", "# this is the main function in which we define our webpage\n", "def main():\n", "\t# giving the webpage a title\n", "\tst.title(\"Iris Flower Prediction\")\n", "\t\n", "\t# here we define some of the front end elements of the web page like\n", "\t# the font and background color, the padding and the text to be displayed\n", "\thtml_temp = \"\"\"\n", "\t
\n", "\t

Streamlit Iris Flower Classifier ML App

\n", "\t
\n", "\t\"\"\"\n", "\t\n", "\t# this line allows us to display the front end aspects we have\n", "\t# defined in the above code\n", "\tst.markdown(html_temp, unsafe_allow_html = True)\n", "\t\n", "\t# the following lines create text boxes in which the user can enter\n", "\t# the data required to make the prediction\n", "\tsepal_length = st.text_input(\"Sepal Length\", \"Type Here\")\n", "\tsepal_width = st.text_input(\"Sepal Width\", \"Type Here\")\n", "\tpetal_length = st.text_input(\"Petal Length\", \"Type Here\")\n", "\tpetal_width = st.text_input(\"Petal Width\", \"Type Here\")\n", "\tresult =\"\"\n", "\t\n", "\t# the below line ensures that when the button called 'Predict' is clicked,\n", "\t# the prediction function defined above is called to make the prediction\n", "\t# and store it in the variable result\n", "\tif st.button(\"Predict\"):\n", "\t\tresult = prediction(sepal_length, sepal_width, petal_length, petal_width)\n", "\tst.success('The output is {}'.format(result))\n", "```" ] }, { "cell_type": "markdown", "id": "7419e2c4-4d20-4d45-ac3c-d613fb90468b", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "##### Task 2: Loading the model and prediction function\n", "```python\n", "# loading in the model to predict on the data\n", "pickle_in = open('classifier.pkl', 'rb')\n", "classifier = pickle.load(pickle_in)\n", "\n", "\n", "# defining the function which will make the prediction using\n", "# the data which the user inputs\n", "def prediction(sepal_length, sepal_width, petal_length, petal_width):\n", "\n", "\tprediction = classifier.predict(\n", "\t\t[[sepal_length, sepal_width, petal_length, petal_width]])\n", "\tprint(prediction)\n", "\treturn prediction\n", "```\t" ] }, { "cell_type": "markdown", "id": "794ac99d-35d3-496d-9506-afc633bf56cf", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "##### Combining the code" ] }, { "cell_type": "markdown", "id": "53086e82-9673-416f-9cd4-84d1c554b29d", "metadata": { "editable": true, "raw_mimetype": "", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Create the new `app.py` file and save the following python code in it.\n", "\n", "```python\n", "import pandas as pd\n", "import numpy as np\n", "import pickle\n", "import streamlit as st\n", "\n", "# loading in the model to predict on the data\n", "pickle_in = open('classifier.pkl', 'rb')\n", "classifier = pickle.load(pickle_in)\n", "\n", "# defining the function which will make the prediction using\n", "# the data which the user inputs\n", "def prediction(sepal_length, sepal_width, petal_length, petal_width):\n", "\n", "\tprediction = classifier.predict(\n", "\t\t[[sepal_length, sepal_width, petal_length, petal_width]])\n", "\tprint(prediction)\n", "\treturn prediction\n", "\t\n", "\n", "# this is the main function in which we define our webpage\n", "def main():\n", "\t# giving the webpage a title\n", "\tst.title(\"Iris Flower Prediction\")\n", "\t\n", "\t# here we define some of the front end elements of the web page like\n", "\t# the font and background color, the padding and the text to be displayed\n", "\thtml_temp = \"\"\"\n", "\t
\n", "\t

Streamlit Iris Flower Classifier ML App

\n", "\t
\n", "\t\"\"\"\n", "\t\n", "\t# this line allows us to display the front end aspects we have\n", "\t# defined in the above code\n", "\tst.markdown(html_temp, unsafe_allow_html = True)\n", "\t\n", "\t# the following lines create text boxes in which the user can enter\n", "\t# the data required to make the prediction\n", "\tsepal_length = st.text_input(\"Sepal Length\", \"Type Here\")\n", "\tsepal_width = st.text_input(\"Sepal Width\", \"Type Here\")\n", "\tpetal_length = st.text_input(\"Petal Length\", \"Type Here\")\n", "\tpetal_width = st.text_input(\"Petal Width\", \"Type Here\")\n", "\tresult =\"\"\n", "\t\n", "\t# the below line ensures that when the button called 'Predict' is clicked,\n", "\t# the prediction function defined above is called to make the prediction\n", "\t# and store it in the variable result\n", "\tif st.button(\"Predict\"):\n", "\t\tresult = prediction(sepal_length, sepal_width, petal_length, petal_width)\n", "\tst.success('The output is {}'.format(result))\n", "\t\n", "if __name__=='__main__':\n", "\tmain()\n", "```" ] }, { "cell_type": "markdown", "id": "b6c4afc1-5cd9-4bbc-aabd-1657eac71cc3", "metadata": { "editable": true, "slideshow": { "slide_type": "subslide" }, "tags": [] }, "source": [ "#### Run the Streamlit server\n", "Command: \n", "```bash\n", " foo@bar:$ streamlit run app.py\n", "\n", "```" ] }, { "cell_type": "markdown", "id": "db9892a8-ced1-4fd8-a21a-8a2e68de484f", "metadata": {}, "source": [ "### Host machine learning model using streamlit and docker\n", "\n", "(Reference [link](https://docs.streamlit.io/knowledge-base/tutorials/deploy/docker))" ] }, { "cell_type": "markdown", "id": "520d30b6-cb7a-4601-a487-32b7fa1dd0d2", "metadata": {}, "source": [ "#### Step 1: Create the folder RequiredFilesForDocker. It shoud consist of following files:\n", "\n", " model.pkl: pickle file of trained model\n", " requirements.txt: a list of required libraries\n", " app.py: streamlit app script to host the model over the webpage\n", "\n" ] }, { "cell_type": "markdown", "id": "8a0b89e7-5669-4739-93ef-56ececd97b51", "metadata": {}, "source": [ "#### Step 2: Create the docker file (let say using name `dockerFile`)\n", "\n", "```dockerfile\n", "\n", "#Using the base image with Python 3.10\n", "# FROM python:3.10\n", "FROM python:3.10\n", " \n", "#Set our working directory as app\n", "WORKDIR /app \n", "\n", "COPY RequiredFilesForDocker ./\n", "\n", "#Installing Python packages through requirements.txt file\n", "RUN pip install -r requirements.txt\n", "\n", "\n", "#Exposing port 8501 from the container\n", "EXPOSE 8501\n", "#Starting the Python application\n", "#CMD [\"gunicorn\", \"--bind\", \"0.0.0.0:5000\", \"script:app\"]\n", "ENTRYPOINT [\"streamlit\", \"run\", \"app.py\", \"--server.port=8501\", \"--server.address=0.0.0.0\"]\n", "```\n" ] }, { "cell_type": "markdown", "id": "238d14e2-d5f3-41cc-b015-d2ee5492d690", "metadata": {}, "source": [ "#### Step 3: Build the docker image\n", "Run the command: `docker build -t streamlit -f dockerFile .`" ] }, { "cell_type": "markdown", "id": "d25001da-7be1-4f50-87a3-6512a60d8c3f", "metadata": {}, "source": [ "#### Step 4: Run the docker image in a container\n", "\n", "Run the command: `docker run -p 8501:8501 streamlit`" ] }, { "cell_type": "markdown", "id": "31a5d344-3410-4e0f-b268-1c98979d70fe", "metadata": {}, "source": [ "## Clone the docker image from Git Repo\n", "\n", "Let us create a docker image from following Git Repo: [https://github.com/streamlit/streamlit-example.git](https://github.com/streamlit/streamlit-example.git)\n", "\n", "#### Step 1: Create the docker file (let say using name `dockerFileFromGit`)\n", "```dockerfile\n", "# app/Dockerfile\n", "\n", "FROM python:3.9-slim\n", "\n", "WORKDIR /app\n", "\n", "RUN apt-get update && apt-get install -y \\\n", " build-essential \\\n", " curl \\\n", " software-properties-common \\\n", " git \\\n", " && rm -rf /var/lib/apt/lists/*\n", "\n", "RUN git clone https://github.com/streamlit/streamlit-example.git .\n", "\n", "RUN pip3 install -r requirements.txt\n", "\n", "EXPOSE 8501\n", "\n", "HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health\n", "\n", "ENTRYPOINT [\"streamlit\", \"run\", \"streamlit_app.py\", \"--server.port=8501\", \"--server.address=0.0.0.0\"]\n", "```" ] }, { "cell_type": "markdown", "id": "1f4b2010-543e-43dc-ac9c-81fc81dba6fe", "metadata": {}, "source": [ "#### Step 3: Build the docker image\n", "Run the command: `docker build -t streamlitFromGit -f dockerFileFromGit .`" ] }, { "cell_type": "markdown", "id": "9ac88085-7c82-46b4-91f7-66d9ac3a7247", "metadata": {}, "source": [ "#### Step 4: Run the docker image in a container\n", "\n", "Run the command: `docker run -p 8501:8501 streamlitFromGit`" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 5 }